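Computational load sharing across a network of heterogeneous robots is a promising approach for increasing the capability and efficiency of robots operating as a team in extreme environments. In such environments, however, communication links can be intermittent, and connectivity to the cloud or the Internet may be nonexistent. In this paper, we introduce the communication-aware computation task scheduling problem for multi-robot systems and propose an integer linear program (ILP) that optimizes the allocation of computational tasks across a network of heterogeneous robots, accounting for the computational capabilities of the networked robots and for the available (and possibly time-varying) communication links. We consider scheduling a set of interdependent required and optional tasks modeled by a dependency graph. We present a scheduling architecture with backup for shared-world, distributed systems. We validate the ILP formulation and the distributed implementation on different computing platforms and in simulated scenarios oriented toward lunar or planetary exploration. Our results show that, compared with an analogous system without computational load sharing, the proposed implementation can optimize schedules to allow a threefold increase in the number of rewarding tasks performed (e.g., science measurements).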
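To make the scheduling problem concrete, the following is a minimal sketch of a much-simplified version of it. It is not the paper's ILP: it ignores time, treats links as static, and replaces the integer program with exhaustive search, which is only practical for a handful of tasks. The task names, costs, capacities, and the functions `feasible` and `best_allocation` are all hypothetical.

```python
from itertools import product


def feasible(plan, tasks, deps, capacity, links):
    """Check one candidate plan (task -> robot, or None if the task is skipped)."""
    # Required tasks must always be scheduled.
    if any(required and plan[name] is None
           for name, (_, _, required) in tasks.items()):
        return False
    # Each robot must stay within its compute budget.
    load = dict.fromkeys(capacity, 0)
    for name, robot in plan.items():
        if robot is not None:
            load[robot] += tasks[name][0]
    if any(load[r] > capacity[r] for r in capacity):
        return False
    # A scheduled task needs its prerequisite scheduled on the same robot
    # or on a robot it can communicate with.
    for upstream, downstream in deps:
        src, dst = plan[upstream], plan[downstream]
        if dst is None:
            continue
        if src is None:
            return False
        if src != dst and frozenset((src, dst)) not in links:
            return False
    return True


def best_allocation(tasks, deps, capacity, links):
    """Return (reward, plan) maximizing total reward, or (None, None) if infeasible."""
    names = list(tasks)
    options = list(capacity) + [None]
    best_reward, best_plan = None, None
    for choice in product(options, repeat=len(names)):
        plan = dict(zip(names, choice))
        if not feasible(plan, tasks, deps, capacity, links):
            continue
        reward = sum(tasks[n][1] for n in names if plan[n] is not None)
        if best_reward is None or reward > best_reward:
            best_reward, best_plan = reward, plan
    return best_reward, best_plan


# Hypothetical scenario: task -> (compute cost, reward, required?)
tasks = {
    "localize": (2, 0, True),
    "science_a": (2, 1, False),
    "science_b": (2, 1, False),
}
deps = [("localize", "science_a"), ("localize", "science_b")]
capacity = {"rover": 3, "lander": 4}

shared, _ = best_allocation(tasks, deps, capacity, {frozenset(("rover", "lander"))})
isolated, _ = best_allocation(tasks, deps, capacity, set())
print(shared, isolated)
```

In this toy scenario, a rover-lander link lets both optional science tasks run, while without the link only one fits next to the required task. That mirrors the qualitative benefit of load sharing described in the abstract, not its reported numbers.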
In intensively managed forests in Europe, where forests are divided into stands of small size and may show heterogeneity within stands, a high spatial resolution (10-20 meters) is arguably needed to capture the differences in canopy height. In this work, we developed a deep learning model based on multi-stream remote sensing measurements to create a high-resolution canopy height map over the "Landes de Gascogne" forest in France, a large maritime pine plantation of 13,000 km$^2$ with flat terrain and intensive management. This area is characterized by even-aged and mono-specific stands, of a typical length of a few hundred meters, harvested every 35 to 50 years. Our deep learning U-Net model uses multi-band images from Sentinel-1 and Sentinel-2 with composite time averages as input to predict tree height derived from GEDI waveforms. The evaluation is performed with external validation data from forest inventory plots and a stereo 3D reconstruction model based on SkySat imagery available at specific locations. We trained seven different U-Net models based on a combination of Sentinel-1 and Sentinel-2 bands to evaluate the importance of each instrument in the dominant height retrieval. The model outputs allow us to generate a 10 m resolution canopy height map of the whole "Landes de Gascogne" forest area for 2020 with a mean absolute error of 2.02 m on the test dataset. The best predictions were obtained using all available satellite layers from Sentinel-1 and Sentinel-2, but using only one satellite source also provided good predictions. For all validation datasets in coniferous forests, our model showed better metrics than previous canopy height models available in the same region.
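For readers unfamiliar with the reported metric, the sketch below shows how a mean absolute error between predicted and reference canopy heights is computed. The function name and the height values are made up for illustration and are not taken from the paper.

```python
def mean_absolute_error(predicted, reference):
    """Average of |prediction - reference| over paired samples (same units as the inputs)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical heights in meters: model output vs. reference measurements.
predicted_heights = [18.0, 22.5, 9.0]
reference_heights = [20.0, 21.5, 12.0]
mae = mean_absolute_error(predicted_heights, reference_heights)
print(f"MAE = {mae:.2f} m")
```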
Advances in computer vision and machine learning techniques have led to significant development in 2D and 3D human pose estimation from RGB cameras, LiDAR, and radars. However, human pose estimation from images is adversely affected by occlusion and lighting, which are common in many scenarios of interest. Radar and LiDAR technologies, on the other hand, need specialized hardware that is expensive and power-intensive. Furthermore, placing these sensors in non-public areas raises significant privacy concerns. To address these limitations, recent research has explored the use of WiFi antennas (1D sensors) for body segmentation and key-point body detection. This paper further expands on the use of the WiFi signal in combination with deep learning architectures, commonly used in computer vision, to estimate dense human pose correspondence. We developed a deep neural network that maps the phase and amplitude of WiFi signals to UV coordinates within 24 human regions. The results of the study reveal that our model can estimate the dense pose of multiple subjects, with comparable performance to image-based approaches, by utilizing WiFi signals as the only input. This paves the way for low-cost, broadly accessible, and privacy-preserving algorithms for human sensing.
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
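As a reminder of what k-fold cross-validation on a training set involves, here is a minimal, library-free sketch of the index splitting. The function name `k_fold_splits` is hypothetical, and the sketch assumes the samples have already been shuffled. It illustrates the general technique rather than any particular challenge entry.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, val_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


for fold, (train, val) in enumerate(k_fold_splits(10, 3)):
    print(f"fold {fold}: train on {len(train)} samples, validate on {val}")
```

Each sample is used for validation exactly once, so the k validation scores together cover the whole training set.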
Static subword tokenization algorithms have been an essential component of recent works on language modeling. However, their static nature results in important flaws that degrade the models' downstream performance and robustness. In this work, we propose MANTa, a Module for Adaptive Neural TokenizAtion. MANTa is a differentiable tokenizer trained end-to-end with the language model. The resulting system offers a trade-off between the expressiveness of byte-level models and the speed of models trained using subword tokenization. In addition, our tokenizer is highly explainable since it produces an explicit segmentation of sequences into blocks. We evaluate our pre-trained model on several English datasets from different domains as well as on synthetic noise. We find that MANTa improves robustness to character perturbations and out-of-domain data. We then show that MANTa performs comparably to other models on the general-domain GLUE benchmark. Finally, we show that it is considerably faster than strictly byte-level models.
In Novel Class Discovery (NCD), the goal is to find new classes in an unlabeled set given a labeled set of known but different classes. While NCD has recently gained attention from the community, no framework has yet been proposed for heterogeneous tabular data, even though tabular data is a very common representation of data. In this paper, we propose TabularNCD, a new method for discovering novel classes in tabular data. We show a way to extract knowledge from already known classes to guide the discovery of novel classes in tabular data that contains heterogeneous variables. Part of this process relies on a new method for defining pseudo labels, and we follow recent findings in Multi-Task Learning to optimize a joint objective function. Our method demonstrates that NCD is applicable not only to images but also to heterogeneous tabular data.
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Diffusion models have achieved unprecedented performance in generative modeling. The commonly-adopted formulation of the latent code of diffusion models is a sequence of gradually denoised samples, as opposed to the simpler (e.g., Gaussian) latent space of GANs, VAEs, and normalizing flows. This paper provides an alternative, Gaussian formulation of the latent space of various diffusion models, as well as an invertible DPM-Encoder that maps images into the latent space. While our formulation is purely based on the definition of diffusion models, we demonstrate several intriguing consequences. (1) Empirically, we observe that a common latent space emerges from two diffusion models trained independently on related domains. In light of this finding, we propose CycleDiffusion, which uses DPM-Encoder for unpaired image-to-image translation. Furthermore, applying CycleDiffusion to text-to-image diffusion models, we show that large-scale text-to-image diffusion models can be used as zero-shot image-to-image editors. (2) One can guide pre-trained diffusion models and GANs by controlling the latent codes in a unified, plug-and-play formulation based on energy-based models. Using the CLIP model and a face recognition model as guidance, we demonstrate that diffusion models have better coverage of low-density sub-populations and individuals than GANs. The code is publicly available at https://github.com/ChenWu98/cycle-diffusion.
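Generative models (e.g., GANs and diffusion models) learn the underlying data distribution in an unsupervised manner. However, many applications of interest require sampling from a specific region of the generative model's output space or evenly over a range of characteristics. To allow efficient sampling in these scenarios, we propose Generative Visual Prompt (PromptGen), a framework for distributional control over pre-trained generative models that incorporates the knowledge of arbitrary off-the-shelf models. PromptGen defines control as an energy-based model (EBM) and samples images in a feed-forward manner by approximating the EBM with invertible neural networks, avoiding optimization at inference time. We demonstrate how PromptGen can control several generative models (e.g., StyleGAN2, StyleNeRF, diffusion autoencoder, and NVAE) using various off-the-shelf models: (1) with the CLIP model, PromptGen can sample images guided by text; (2) with image classifiers, PromptGen can de-bias generative models across a set of attributes; and (3) with inverse graphics models, PromptGen can sample images of the same identity in different poses. (4) Finally, PromptGen reveals that the CLIP model shows "reporting bias" when used as control, and PromptGen can further de-bias this controlled distribution in an iterative manner. Our code is available at https://github.com/chenwu98/generative-visual-prompt.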
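In this paper, we present BayesLDM, a system for Bayesian longitudinal data modeling consisting of a high-level modeling language with specific features for modeling complex multivariate time series data, coupled with a compiler that can produce optimized probabilistic program code for performing inference in the specified model. BayesLDM supports modeling of Bayesian network models, with a specific focus on the efficient, declarative specification of dynamic Bayesian networks (DBNs). The BayesLDM compiler combines a model specification with the available data and outputs code for performing Bayesian inference for unknown model parameters while simultaneously handling missing data. These capabilities have the potential to significantly accelerate iterative modeling workflows in domains that involve the analysis of complex longitudinal data, by abstracting away the process of producing computationally efficient probabilistic inference code. We describe the BayesLDM system components, evaluate the efficiency of the representation and inference optimizations, and provide an illustrative example of applying the system to the analysis of heterogeneous and partially observed mobile health data.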